Weakly Supervised Learning for Hedge Classification in Scientific Literature
Abstract
We investigate automatic classification of speculative language (‘hedging’) in biomedical text using weakly supervised machine learning (ML). Our contributions include a precise description of the task with annotation guidelines, analysis and discussion, a probabilistic weakly supervised learning model, and experimental evaluation of the methods presented. We show that hedge classification is feasible using weakly supervised ML, and point toward avenues for future research.
Similar resources
Exploring hedge identification in biomedical literature
We investigate automatic identification of speculative language, or 'hedging', in scientific literature from the biomedical domain. Our contributions include a precise description of the task including annotation guidelines, theoretical analysis and discussion. We show that good agreement can be achieved using our guidelines and present a publicly available benchmark dataset for the task. We ar...
Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords
Since facts or statements in a hedge or negated context typically appear as false positives, the proper handling of these language phenomena is of great importance in biomedical text mining. In this paper we demonstrate the importance of hedge classification experimentally in two real life scenarios, namely the ICD9-CM coding of radiology reports and gene name Entity Extraction from scientific ...
Weakly supervised learning of information structure of scientific abstracts - is it accurate enough to benefit real-world tasks in biomedicine?
MOTIVATION: Many practical tasks in biomedicine require accessing specific types of information in scientific literature; e.g. information about the methods, results or conclusions of the study in question. Several approaches have been developed to identify such information in scientific journal articles. The best of these have yielded promising results and proved useful for biomedical text mini...
HedgeHunter: A System for Hedge Detection and Uncertainty Classification
With the dramatic growth of scientific publishing, Information Extraction (IE) systems are becoming an increasingly important tool for large scale data analysis. Hedge detection and uncertainty classification are important components of a high precision IE system. This paper describes a two part supervised system which classifies words as hedge or nonhedged and sentences as certain or uncertain...
A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
Argumentative Zoning (AZ) – analysis of the argumentative structure of a scientific paper – has proved useful for a number of information access tasks. Current approaches to AZ rely on supervised machine learning (ML). Requiring large amounts of annotated data, these approaches are expensive to develop and port to different domains and tasks. A potential solution to this problem is to use weakl...
Publication date: 2007